Prototype references #6433

Draft: wants to merge 100 commits into main
Conversation

@pmeier (Collaborator) commented Aug 17, 2022

I don't want to merge this PR. It is more like a feature branch that we can discuss on. For the actual port, we can either clean up this PR or use it as a starting point for another.

@datumbox (Contributor) left a comment


Looks good overall. Just a few comments. Feel free to ignore if it's too early.

references/classification/train.py (outdated)
if mixup_or_cutmix:
    batch_transform = transforms.Compose(
        [
            WrapIntoFeatures(),
@datumbox (Contributor):
Don't we have to WrapIntoFeatures unconditionally, regardless of whether we use mixup/cutmix?

@pmeier (Collaborator, Author):

The problem is that default_collate does not respect tensor subclasses. Since we use this transform afterwards, we need to wrap here. Of course, we could also wrap earlier, but that is not necessary, since the input is a plain tensor that defaults to an image and an integer that is ignored completely.
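As an aside, the behavior in question can be probed directly. A minimal sketch, assuming only torch is available; MyImage is a hypothetical stand-in for the prototype feature type, not torchvision code:

```python
import torch
from torch.utils.data import default_collate

class MyImage(torch.Tensor):
    """Hypothetical stand-in for a tensor subclass like features.Image."""
    pass

# A batch of (image, label) samples, as a DataLoader would hand them to collate.
batch = [(torch.rand(3, 4, 4).as_subclass(MyImage), 0) for _ in range(2)]
images, labels = default_collate(batch)

# Whether `images` keeps the MyImage subclass or is downgraded to a plain
# Tensor depends on the collate/stack implementation -- which is exactly the
# fragility described above.
print(type(images).__name__, tuple(images.shape))
```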

@datumbox (Contributor):

My question is: what if I don't use mixup or cutmix? Shouldn't we wrap the data into features.Image anyway? I might be missing something here. My main point is that, since we are testing the new API, we should probably wrap all inputs in their appropriate types and see how the new kernels behave (rather than relying on their default/legacy pure-tensor implementations).

@pmeier (Collaborator, Author):

We can only wrap after we have converted from PIL. This happens fairly late in the transform pipeline:

transforms.PILToTensor(),

I remember @vfdev-5 noting that on CPU the PIL kernels are faster (I don't remember whether there was a special case or other constraints; please fill in the blanks). Thus, if we want to optimize for speed, we should probably leave it as is. No strong opinion, though.
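To make the ordering constraint concrete, here is a self-contained sketch; every name below is a hypothetical stand-in, not the actual torchvision API. Wrapping into the feature subclass can only happen once a PILToTensor-style conversion has produced a tensor:

```python
import torch

class Image(torch.Tensor):
    """Hypothetical stand-in for the prototype features.Image subclass."""
    pass

def pil_to_tensor(pil_image):
    # Stand-in for transforms.PILToTensor(); the "PIL image" is faked as a
    # plain tensor so the sketch stays self-contained.
    return torch.as_tensor(pil_image)

def wrap_into_features(sample):
    # Tagging with the subclass is only possible on a tensor, i.e. after the
    # PIL -> tensor conversion has already run.
    image, label = sample
    return image.as_subclass(Image), label

sample = (pil_to_tensor(torch.rand(3, 8, 8)), 7)
image, label = wrap_into_features(sample)
print(isinstance(image, Image), label)  # True 7
```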

torchvision/prototype/transforms/__init__.py (outdated)
@pmeier pmeier reopened this Aug 18, 2022
@pmeier pmeier changed the title Prototype references/classification Prototype references Aug 24, 2022
@pmeier (Collaborator, Author) commented Aug 24, 2022

I've added the needed changes for the detection references.

@datumbox (Contributor) left a comment


@pmeier There seem to be a few issues with the scripts. See below. Might be worth doing a dummy run with very little data to confirm they work.

references/classification/train.py (outdated)
references/detection/coco_utils.py
@pmeier (Collaborator, Author) commented Sep 1, 2022

I've run the references for a few iterations with the following parameters to confirm they work:

  • Classification:

    [
        "--device=cpu",
        "--batch-size=2",
        "--epochs=1",
        "--workers=2",
        "--mixup-alpha=0.5",
        "--cutmix-alpha=0.5",
        "--auto-augment=ra",  # "ra", "ta_wide", "augmix", "imagenet", "cifar10", "svhn"
        "--random-erase=1.0",
    ]
  • Detection:

    [
        "--device=cpu",
        "--batch-size=2",
        "--epochs=1",
        "--workers=2",
        "--data-augmentation=hflip",  # "hflip", "lsj", "multiscale", "ssd", "ssdlite"
        # "--use-copypaste",  # if data_augmentation == "lsj"
    ]
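For convenience, the flag lists above can be assembled into a runnable command line; the script path below is an assumption based on the files touched in this PR:

```python
import shlex

# Smoke-test flags for the classification reference script, copied from above.
classification_args = [
    "--device=cpu",
    "--batch-size=2",
    "--epochs=1",
    "--workers=2",
    "--mixup-alpha=0.5",
    "--cutmix-alpha=0.5",
    "--auto-augment=ra",
    "--random-erase=1.0",
]

# Assumed script location; adjust to wherever the reference scripts live.
cmd = ["python", "references/classification/train.py", *classification_args]
print(shlex.join(cmd))
```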

@datumbox (Contributor) commented Nov 4, 2022

Tensor Backend + antialias=True

Reverifying the new API after the speed optimizations. Reference runs at #6433 (comment) and #6433 (comment)

Classification

Augmentation: ta_wide + random erasing + mixup + cutmix

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --model resnet50 --batch-size 128 --lr 0.5 --lr-scheduler cosineannealinglr --lr-warmup-epochs 5 --lr-warmup-method linear --auto-augment ta_wide --epochs 600 --random-erase 0.1 --label-smoothing 0.1 --mixup-alpha 0.2 --cutmix-alpha 1.0 --weight-decay 0.00002 --norm-weight-decay 0.0 --train-crop-size 176 --model-ema --val-resize-size 232 --ra-sampler --ra-reps 4 --data-path /datasets01_ontap/imagenet_full_size/061417/
# V2 Target Acc: 80.626 / 95.310 - time: 2 days, 6:04:18 - jobid: experiments/PR6433/68029
Submitted job_id: 75703
Test: EMA Acc@1 80.862 Acc@5 95.476
Training time 2 days, 0:12:39

Result: Similar accuracy, 11% faster than unoptimized V2.

Augmentation: aa + random erasing

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --model mobilenet_v3_small --epochs 600 --opt rmsprop --batch-size 128 --lr 0.064 --wd 0.00001 --lr-step-size 2 --lr-gamma 0.973 --auto-augment imagenet --random-erase 0.2 --data-path /datasets01_ontap/imagenet_full_size/061417/
# V2 Target Acc: 66.044 / 86.338 - time: 2 days, 4:55:57 - jobid: experiments/PR6433/68030
Submitted job_id: 75704
Test:  Acc@1 67.146 Acc@5 87.086
Training time 1 day, 22:05:18

Result: Similar accuracy (improvement not statistically significant), 13% faster than unoptimized V2.

Detection

Augmentation: multiscale

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --weights-backbone ResNet50_Weights.IMAGENET1K_V2 --dataset coco --model retinanet_resnet50_fpn_v2 --opt adamw --lr 0.0001 --epochs 26 --lr-steps 16 22 --weight-decay 0.05 --norm-weight-decay 0.0 --data-augmentation multiscale --sync-bn --data-path /datasets01_ontap/COCO/022719/
# V2 Target Acc: 0.415 - time: 9:49:04 - jobid: experiments/PR6433/67794
Submitted job_id: 75705
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.413
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.615
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.437
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.273
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.456
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.536
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.544
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.585
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.434
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.625
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.719
Training time 9:46:33

Result: Similar accuracy and speed.

Augmentation: ssdlite

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --dataset coco --model ssdlite320_mobilenet_v3_large --aspect-ratio-group-factor 3 --epochs 660 --lr-scheduler cosineannealinglr --lr 0.15 --batch-size 24 --weight-decay 0.00004 --data-augmentation ssdlite --data-path /datasets01_ontap/COCO/022719/
# V2 Target Acc: 0.212 - time: 1 day, 16:06:26 - jobid: experiments/PR6433/67795
Submitted job_id: 75706
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.210
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.341
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.218
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.009
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.198
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.434
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.207
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.304
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.330
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.041
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.338
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.644
Training time 1 day, 14:44:39

Result: Similar accuracy, 8% faster than unoptimized V2.

Augmentation: ssd

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --weights-backbone VGG16_Weights.IMAGENET1K_FEATURES --dataset coco --model ssd300_vgg16 --epochs 120 --lr-steps 80 110 --aspect-ratio-group-factor 3 --lr 0.002 --batch-size 4 --weight-decay 0.0005 --trainable-backbone-layers 5 --data-augmentation ssd --data-path /datasets01_ontap/COCO/022719/
# V2 Target Acc: 0.252 - time: 17:28:42 - jobid: experiments/PR6433/67796
Submitted job_id: 75707
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.254
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.421
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.264
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.056
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.272
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.437
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.238
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.346
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.367
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.091
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.409
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.599
Training time 16:43:37

Result: Similar accuracy, 4% faster than unoptimized V2.

Augmentation: lsj + copypaste

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 4 --dataset coco --model maskrcnn_resnet50_fpn_v2 --epochs 600 --lr-steps 540 570 585 --lr 0.32 --batch-size 8 --weight-decay 0.00004 --sync-bn --data-augmentation lsj --use-copypaste --data-path /datasets01_ontap/COCO/022719/
# V2 Target Acc: 0.474 / 0.416 - time: 3 days, 15:09:55 - jobid: experiments/PR6433/67791
Submitted job_id: 75998
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.480
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.682
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.526
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.318
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.516
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.626
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.371
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.592
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.621
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.447
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.657
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.768
IoU metric: segm
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.419
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.655
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.452
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.231
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.447
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.609
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.336
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.526
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.550
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.371
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.589
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.709
Training time 3 days, 14:07:56

Result: Similar accuracy and speed.

Segmentation

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 1 --dataset coco -b 4 --model lraspp_mobilenet_v3_large --lr 0.01 --wd 0.000001 --weights-backbone MobileNet_V3_Large_Weights.IMAGENET1K_V1
# V2 Target Acc: 90.5 / 54.7 - time: 2:20:45 - jobid: experiments/PR6433/67797
Submitted job_id: 75997
global correct: 90.7
average row correct: ['94.8', '72.7', '63.3', '79.2', '43.5', '28.8', '84.4', '60.7', '84.2', '28.4', '90.5', '52.2', '80.7', '74.2', '85.6', '88.2', '33.3', '67.6', '55.9', '77.9', '59.0']
IoU: ['89.6', '52.9', '59.2', '69.2', '38.5', '24.2', '79.2', '52.2', '71.6', '23.2', '56.8', '35.8', '62.3', '56.3', '72.0', '78.3', '23.1', '57.7', '37.3', '54.6', '54.7']
mean IoU: 54.7
Training time 2:20:49 

Result: Similar accuracy and speed.

Video

New Recipe

Using githash 959af2d:

PYTHONPATH=$PYTHONPATH:`pwd` python -u run_with_submitit.py --ngpus 8 --nodes 8 --cache-dataset --batch-size=12 --lr 0.2 --clip-len 64 --clips-per-video 5 --sync-bn --model s3d --auto-augment ta_wide --mixup-alpha 0.8 --cutmix-alpha 1.0 --random-erase 0.25 --train-resize-size 256 320 --train-crop-size 224 224 --val-resize-size 256 256 --val-crop-size 224 224 --data-path="/datasets01_ontap_isolated/kinetics/070618/400/"
# V2 Target Acc: 70.903 / 90.434 - time: 3 days, 6:00:05 - jobid: experiments/PR6433/72508
Submitted job_id: 75701
Training time 3 days, 3:42:15
trainrun torchrun --nproc_per_node=8 train.py --data-path="/datasets01_ontap/kinetics/070618/400/" --batch-size=16 --test-only --cache-dataset --clip-len 128 --clips-per-video 1 --model s3d --val-resize-size 256 256 --val-crop-size 224 224 --resume experiments/PR6433/75701/model_40.pth
 * Clip Acc@1 71.134 Clip Acc@5 90.486

Result: Similar accuracy, faster speed. The improvement is harder to estimate because the logs indicate an IO slowdown caused by OnTap during at least 3 epochs. The new version is consistently 6-7 minutes per epoch faster than the old, which translates to a roughly 7-8% improvement.

@NicolasHug mentioned this pull request Feb 10, 2023
6 participants